Honeybees are very important organisms to the world at large. Honeybees are important pollinators of flowering plants, and thus are essential to the agriculture industry. Despite their vital importance, honeybee populations have been declining for nearly thirty years. One of the main reasons for this decline in honeybee numbers is the Varroa mite (Varroa destructor), an external parasite of honeybees. Varroa mites are responsible for causing Varroosis in honeybees, weakening adult bees by feeding on their haemolymph and thereby increasing their susceptibility to other diseases. The current protocol in place to help rid honeybee colonies of Varroa mites is through miticides. However, this method of pest control is not sustainable as the use of miticides may further weaken the honeybees or contaminate hive products. More importantly, an increase in resistance to miticides by Varroa mites has been seen. Consequently, it is imperative to develop a new means of controlling these pests to increase honeybee populations in the long term.
This research aims to examine various odorants and their effects on both Varroa mites and honeybees to analyze their attractant or repellent properties on the two species. Additionally, the effects of the odorants on honeybee and Varroa mite behaviour, as well as effects on in-hive production, will be examined.
The outcome of this research project will be to implement the use of odorants as a more sustainable means of controlling Varroa mite populations, without the use of toxic pesticides. Successful implementation of the results of this study could help slow the decline of honeybee populations worldwide by decreasing the presence of Varroa mites in honeybee colonies.
Over the summer, Mike conducted electrotarsograms (ETGs) on Varroa mites to examine their physiological sensitivity to several previously identified attractive odourants.
ETGs on Varroa mites were conducted using single odourants previously identified as evoking a response in Varroa mites, and were tested in four different concentrations (10ng, 1ng, 100μg, 10μg). These are typical concentrations used in ETG studies and often encompass concentrations close to what Varroa mites would encounter in their natural environment. Relative to a control stimulus, these reactions can be used to infer the relative importance of each odourant to Varroa mites in the hive environment.
The questions we want to answer from the dataset are as follows:
1. Are there mite trials that are outliers?
2. Are there odours which consistantly evoke a response?
3. Are there concentration-dependent trends?
4. Can we quantify and account for between-trial (between-mite) variability?
my.df <- read_csv('DataForR.csv')
## Parsed with column specification:
## cols(
## trial = col_integer(),
## odour = col_character(),
## conc = col_double(),
## percent = col_double(),
## gb = col_character(),
## normalized = col_double()
## )
str(my.df)
## Classes 'tbl_df', 'tbl' and 'data.frame': 1165 obs. of 6 variables:
## $ trial : int 1 1 1 1 1 1 1 1 1 1 ...
## $ odour : chr "2-Hydroxyhexanoic Acid (DCM)" "2-Hydroxyhexanoic Acid (DCM)" "2-Hydroxyhexanoic Acid (DCM)" "Methyl Palmitate" ...
## $ conc : num 0.01 0.1 1 0.01 0 10 0.01 1 0.01 0.1 ...
## $ percent : num 53.6 142.5 157.9 137.7 100 ...
## $ gb : chr "Good" "Good" "Good" "Good" ...
## $ normalized: num 13.3 129.8 136.7 127.4 100 ...
## - attr(*, "spec")=List of 2
## ..$ cols :List of 6
## .. ..$ trial : list()
## .. .. ..- attr(*, "class")= chr "collector_integer" "collector"
## .. ..$ odour : list()
## .. .. ..- attr(*, "class")= chr "collector_character" "collector"
## .. ..$ conc : list()
## .. .. ..- attr(*, "class")= chr "collector_double" "collector"
## .. ..$ percent : list()
## .. .. ..- attr(*, "class")= chr "collector_double" "collector"
## .. ..$ gb : list()
## .. .. ..- attr(*, "class")= chr "collector_character" "collector"
## .. ..$ normalized: list()
## .. .. ..- attr(*, "class")= chr "collector_double" "collector"
## ..$ default: list()
## .. ..- attr(*, "class")= chr "collector_guess" "collector"
## ..- attr(*, "class")= chr "col_spec"
summary(my.df)
## trial odour conc percent
## Min. : 1.00 Length:1165 Min. : 0.000 Min. : 17.59
## 1st Qu.: 9.00 Class :character 1st Qu.: 0.010 1st Qu.: 78.07
## Median :13.00 Mode :character Median : 0.100 Median : 100.00
## Mean :12.22 Mean : 2.665 Mean : 117.17
## 3rd Qu.:16.00 3rd Qu.: 1.000 3rd Qu.: 125.02
## Max. :18.00 Max. :10.000 Max. :2507.58
## gb normalized
## Length:1165 Min. :-368.60
## Class :character 1st Qu.: 71.91
## Mode :character Median : 100.00
## Mean : 89.80
## 3rd Qu.: 120.01
## Max. : 196.01
hist(my.df$normalized,
xlab="Normalized ETG Response", ylab="Frequency",
main="Frequency of Normalized ETG Responses")
p <- ggplot(data=my.df, aes(normalized))
p + geom_density()
Doing this produces a histogram of the percent reaction by Varroa mites. This is an easy way to see how the data is distributed.
#change integer to factor for trials
my.df$trial<-factor(my.df$trial)
#lists the levels of factors
levels(my.df$trial)
## [1] "1" "2" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15"
## [15] "16" "17" "18"
#change number to factor for concentration "conc"
my.df$conc<-factor(my.df$conc)
levels(my.df$conc)
## [1] "0" "0.01" "0.1" "1" "10"
# types of odours
my.df$odour<-factor(my.df$odour)
levels(my.df$odour)
## [1] "1-Hexadecanol" "2-Heptanol"
## [3] "2-Heptanone" "2-Hydroxyhexanoic Acid (DCM)"
## [5] "2-Nonanal" "Benzoic Acid"
## [7] "Butyric Acid" "Dodecyl Aldehyde"
## [9] "Ethyl Palmitate" "Geraniol"
## [11] "Heptacosane" "Heptadecane"
## [13] "Hexadecanal" "Hexane"
## [15] "Methyl Oleate" "Methyl Palmitate"
## [17] "Octadecanol" "Octanoic Acid"
## [19] "Palmitic Acid" "trans-Nerolidol"
ggplot(my.df)+
geom_boxplot(aes(x=conc,y= (normalized - 100), group= conc))+
xlab("Concentration")+ylab("Response to Odour")+ggtitle("Faceted ETG Plots")+
theme_bw(10)+
facet_wrap(~odour)
We discovered that all runs with these odours were highly variable and produced messy plots, so we will remove them for now.
good_runs<- filter(my.df,
!(odour %in% c("1-Hexadecanol", "Methyl Palmitate", "Hexane", "Palmitic Acid" )))
subsetted <- mutate(good_runs,
Zresponse = normalized - 100)
hist(subsetted$Zresponse,xlab="Normalized ETG Response",ylab="Frequency", main="Frequency of Normalized ETG Responses")
A negative response makes no sense physiologically, so this is an indication of a bad run. We remove these negative values from analysis by making them NA.
nonegatives <- subsetted %>%
mutate(Zresponse= ifelse(Zresponse <0, NA, Zresponse))
str(nonegatives)
## Classes 'tbl_df', 'tbl' and 'data.frame': 706 obs. of 7 variables:
## $ trial : Factor w/ 17 levels "1","2","4","5",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ odour : Factor w/ 20 levels "1-Hexadecanol",..: 4 4 4 15 12 12 12 2 5 5 ...
## $ conc : Factor w/ 5 levels "0","0.01","0.1",..: 2 3 4 5 2 3 4 3 2 3 ...
## $ percent : num 53.6 142.5 157.9 202.7 44.1 ...
## $ gb : chr "Good" "Good" "Good" "Good" ...
## $ normalized: num 13.3 129.8 136.7 150.7 -27 ...
## $ Zresponse : num NA 29.8 36.7 50.7 NA ...
Now that we have manipulated the data in a way that is useful to us, we will make our faceted plot.
ggplot(nonegatives)+
geom_boxplot(aes(x=conc,y= Zresponse, group= conc))+
xlab("Concentration")+ylab("Normalized Percent Reaction")+ggtitle("Faceted ETG Plots")+
theme_bw(10)+
facet_wrap(~odour)
## Warning: Removed 366 rows containing non-finite values (stat_boxplot).
From this plot, we can see the overall trend in ETG responses from Varroa mites at each concentration for each compound tested.
We are still waiting on better temperature and humidity data. We have been in touch with someone regarding this.
absoluteweather<- read.csv('absoluteweather.csv')
library(tidyverse)
absoluteweather$conc <- factor(absoluteweather$conc)
absoluteweather$trial <- factor(absoluteweather$trial)
str(absoluteweather)
## 'data.frame': 1165 obs. of 13 variables:
## $ trial : Factor w/ 17 levels "1","2","4","5",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ odour : Factor w/ 20 levels "1-Hexadecanol",..: 4 4 4 16 14 15 1 16 12 12 ...
## $ conc : Factor w/ 5 levels "0","0.01","0.1",..: 2 3 4 2 1 5 2 4 2 3 ...
## $ absamp : num 3.02 8.09 9 7.95 5.91 ...
## $ linearhex : num 5.64 5.67 5.7 5.77 5.91 ...
## $ percent : num 53.6 142.5 157.9 137.7 100 ...
## $ normalized: num 13.3 129.8 136.7 127.4 100 ...
## $ gb : Factor w/ 2 levels "Bad","Good": 2 2 2 2 2 2 2 2 2 2 ...
## $ humid : num 69 69 69.1 69.2 69.3 ...
## $ temp : num 20.8 20.8 20.8 20.7 20.7 ...
## $ dewp : num 14.9 14.9 14.9 14.9 14.9 ...
## $ hour : num 2013 2014 2014 2016 2019 ...
## $ min : num 13.1 13.8 14.5 16 18.9 ...
ggplot(data = absoluteweather, mapping = aes(humid, absamp)) +
geom_jitter(aes(colour = absoluteweather$trial), width = 0.25) +
ggtitle("Amplitude vs Humidity")
ggplot(data = absoluteweather, mapping = aes(temp, absamp)) +
geom_jitter(aes(colour = absoluteweather$trial), width = 0.25) +
ggtitle("Amplitude vs Temperature")
ggplot(data = absoluteweather, mapping = aes(dewp, absamp)) +
geom_jitter(aes(colour = absoluteweather$trial), width = 0.25) +
ggtitle("Amplitude vs Dewpoint")
ggplot(data = absoluteweather, mapping = aes(absamp))+
geom_histogram(bins = 50)
# correlations
cor(absoluteweather$absamp, absoluteweather$temp)
## [1] -0.4868699
cov(absoluteweather$absamp, absoluteweather$temp)
## [1] -234.6016
corr_amp_tempr <- cor.test(x=absoluteweather$absamp,
y=absoluteweather$temp, method = 'spearman')
## Warning in cor.test.default(x = absoluteweather$absamp, y = absoluteweather
## $temp, : Cannot compute exact p-value with ties
corr_amp_tempr
##
## Spearman's rank correlation rho
##
## data: absoluteweather$absamp and absoluteweather$temp
## S = 367540000, p-value < 2.2e-16
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## -0.3946761
corr_amp_humid <- cor.test(x=absoluteweather$absamp,
y=absoluteweather$humid, method = 'spearman')
## Warning in cor.test.default(x = absoluteweather$absamp, y = absoluteweather
## $humid, : Cannot compute exact p-value with ties
corr_amp_humid
##
## Spearman's rank correlation rho
##
## data: absoluteweather$absamp and absoluteweather$humid
## S = 103650000, p-value < 2.2e-16
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## 0.606672
corr_amp_dewp<- cor.test(x=absoluteweather$absamp,
y=absoluteweather$dewp, method = 'spearman')
## Warning in cor.test.default(x = absoluteweather$absamp, y = absoluteweather
## $dewp, : Cannot compute exact p-value with ties
corr_amp_dewp
##
## Spearman's rank correlation rho
##
## data: absoluteweather$absamp and absoluteweather$dewp
## S = 195510000, p-value < 2.2e-16
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## 0.2580989
library(fifer)
## Loading required package: MASS
##
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
##
## select
spearman.plot(absoluteweather$absamp, absoluteweather$temp)
weather_matrix <- data.matrix(absoluteweather[7:10], rownames.force = NA)
pairs(weather_matrix) # plots all the stuff simultaneiously
absoluteweather <- read.csv('absoluteweather.csv')
str(absoluteweather)
## 'data.frame': 1165 obs. of 13 variables:
## $ trial : int 1 1 1 1 1 1 1 1 1 1 ...
## $ odour : Factor w/ 20 levels "1-Hexadecanol",..: 4 4 4 16 14 15 1 16 12 12 ...
## $ conc : num 0.01 0.1 1 0.01 0 10 0.01 1 0.01 0.1 ...
## $ absamp : num 3.02 8.09 9 7.95 5.91 ...
## $ linearhex : num 5.64 5.67 5.7 5.77 5.91 ...
## $ percent : num 53.6 142.5 157.9 137.7 100 ...
## $ normalized: num 13.3 129.8 136.7 127.4 100 ...
## $ gb : Factor w/ 2 levels "Bad","Good": 2 2 2 2 2 2 2 2 2 2 ...
## $ humid : num 69 69 69.1 69.2 69.3 ...
## $ temp : num 20.8 20.8 20.8 20.7 20.7 ...
## $ dewp : num 14.9 14.9 14.9 14.9 14.9 ...
## $ hour : num 2013 2014 2014 2016 2019 ...
## $ min : num 13.1 13.8 14.5 16 18.9 ...
absoluteweather$trial <- as.factor(absoluteweather$trial)
absoluteweather$conc <- as.factor(absoluteweather$conc)
subsetted <- mutate(absoluteweather,
Zresponse = normalized - 100)
nonegatives <- subsetted %>%
mutate(Zresponse= ifelse(Zresponse <0, NA, Zresponse))
str(nonegatives$Zresponse)
## num [1:1165] NA 29.8 36.7 27.4 0 ...
ggplot(data = nonegatives, aes(min, absamp))+
geom_jitter(aes(colour = nonegatives$odour), width = 0.25)+
xlab("Time")+ylab("Absamp")+ggtitle("Faceted ETG Plots")+
theme_bw(10)+
facet_wrap(~trial)
goodtrials <- nonegatives[nonegatives$trial %in% c(1:8,11, 12), ]
str(goodtrials)
## 'data.frame': 453 obs. of 14 variables:
## $ trial : Factor w/ 17 levels "1","2","4","5",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ odour : Factor w/ 20 levels "1-Hexadecanol",..: 4 4 4 16 14 15 1 16 12 12 ...
## $ conc : Factor w/ 5 levels "0","0.01","0.1",..: 2 3 4 2 1 5 2 4 2 3 ...
## $ absamp : num 3.02 8.09 9 7.95 5.91 ...
## $ linearhex : num 5.64 5.67 5.7 5.77 5.91 ...
## $ percent : num 53.6 142.5 157.9 137.7 100 ...
## $ normalized: num 13.3 129.8 136.7 127.4 100 ...
## $ gb : Factor w/ 2 levels "Bad","Good": 2 2 2 2 2 2 2 2 2 2 ...
## $ humid : num 69 69 69.1 69.2 69.3 ...
## $ temp : num 20.8 20.8 20.8 20.7 20.7 ...
## $ dewp : num 14.9 14.9 14.9 14.9 14.9 ...
## $ hour : num 2013 2014 2014 2016 2019 ...
## $ min : num 13.1 13.8 14.5 16 18.9 ...
## $ Zresponse : num NA 29.8 36.7 27.4 0 ...
remove(subsetted)
ggplot(data = nonegatives, mapping = aes(Zresponse))+
geom_histogram(bins = 50)
## Warning: Removed 577 rows containing non-finite values (stat_bin).
plot(aov_out <- aov(Zresponse~odour*conc*trial, data = goodtrials))
## Warning: not plotting observations with leverage one:
## 1, 2, 5, 6, 8, 12, 13, 14, 19, 21, 22, 23, 24, 25, 26, 28, 31, 32, 33, 38, 39, 40, 43, 45, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 64, 65, 67, 68, 69, 70, 71, 85, 86, 87, 88, 89, 90, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 116, 117, 118, 119, 120, 125, 126, 127, 128, 129, 131, 132, 136, 137, 140, 141, 146, 153, 157, 158, 159, 162, 165, 166, 167, 168, 175, 176, 177, 178, 179, 181, 182, 183, 184, 185, 189, 190, 191, 192, 193, 194, 195, 196, 197, 203, 204, 205, 206, 207, 208, 209, 210, 215, 216, 217, 220
## Warning: not plotting observations with leverage one:
## 1, 2, 5, 6, 8, 12, 13, 14, 19, 21, 22, 23, 24, 25, 26, 28, 31, 32, 33, 38, 39, 40, 43, 45, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 64, 65, 67, 68, 69, 70, 71, 85, 86, 87, 88, 89, 90, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 116, 117, 118, 119, 120, 125, 126, 127, 128, 129, 131, 132, 136, 137, 140, 141, 146, 153, 157, 158, 159, 162, 165, 166, 167, 168, 175, 176, 177, 178, 179, 181, 182, 183, 184, 185, 189, 190, 191, 192, 193, 194, 195, 196, 197, 203, 204, 205, 206, 207, 208, 209, 210, 215, 216, 217, 220
## Warning in sqrt(crit * p * (1 - hh)/hh): NaNs produced
## Warning in sqrt(crit * p * (1 - hh)/hh): NaNs produced
summary(aov_out)
## Df Sum Sq Mean Sq F value Pr(>F)
## odour 18 34745 1930 6.580 3.18e-09 ***
## conc 3 770 257 0.875 0.4581
## trial 8 36778 4597 15.671 4.95e-13 ***
## odour:conc 36 8820 245 0.835 0.7195
## odour:trial 46 20590 448 1.526 0.0538 .
## conc:trial 22 5605 255 0.869 0.6331
## odour:conc:trial 20 3987 199 0.680 0.8329
## Residuals 71 20829 293
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 228 observations deleted due to missingness
library(tidyverse)
library(modelr)
options(na.action = na.warn)
ggplot(goodtrials, aes(trial, absamp))+
geom_point(aes(colour=odour))
That plot looks too spread out, so let’s log transform the data and try again
goodtrials$logging <- log(goodtrials$absamp)
ggplot(goodtrials, aes(trial, logging))+
geom_jitter(aes(colour=odour))+
xlab("Trial")+ylab("Log(Absolute Amplitude of Response)")
gucci<- filter(goodtrials,
!(odour %in% c("1-Hexadecanol", "Methyl Palmitate", "Hexane", "Palmitic Acid" )))
These aren’t count data, so we don’t anticipate the poisson model to fit very well.
mod01 <- glm(I(as.integer(gucci$Zresponse))~conc+odour, data = gucci, poisson)
## Warning: Dropping 160 rows with missing values
par(mfrow=c(2, 2))
summary(mod01)
##
## Call:
## glm(formula = I(as.integer(gucci$Zresponse)) ~ conc + odour,
## family = poisson, data = gucci)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -8.3223 -3.4985 -0.7304 2.4474 8.2475
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.92806 0.07993 36.633 < 2e-16 ***
## conc0.1 0.12707 0.05345 2.377 0.01744 *
## conc1 0.22926 0.05261 4.358 1.31e-05 ***
## conc10 -0.04868 0.05964 -0.816 0.41436
## odour2-Heptanone 0.25672 0.11962 2.146 0.03186 *
## odour2-Hydroxyhexanoic Acid (DCM) 0.24729 0.09133 2.708 0.00678 **
## odour2-Nonanal 0.35450 0.08657 4.095 4.22e-05 ***
## odourBenzoic Acid -0.69356 0.13930 -4.979 6.40e-07 ***
## odourButyric Acid -0.50406 0.16233 -3.105 0.00190 **
## odourDodecyl Aldehyde 0.03866 0.14195 0.272 0.78532
## odourEthyl Palmitate -0.17133 0.27055 -0.633 0.52656
## odourGeraniol -0.71717 0.16899 -4.244 2.20e-05 ***
## odourHeptacosane -0.03562 0.14921 -0.239 0.81134
## odourHeptadecane 0.52008 0.08242 6.310 2.79e-10 ***
## odourHexadecanal 0.60396 0.08908 6.780 1.20e-11 ***
## odourMethyl Oleate 0.66536 0.08303 8.013 1.12e-15 ***
## odourOctadecanol -0.57679 0.32639 -1.767 0.07719 .
## odourOctanoic Acid -0.05128 0.10461 -0.490 0.62398
## odourtrans-Nerolidol -0.62392 0.16655 -3.746 0.00018 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 2142.0 on 108 degrees of freedom
## Residual deviance: 1696.7 on 90 degrees of freedom
## (160 observations deleted due to missingness)
## AIC: 2228.1
##
## Number of Fisher Scoring iterations: 5
mod02 <- glm(Zresponse~conc+odour, data = gucci)
## Warning: Dropping 160 rows with missing values
par(mfrow=c(2, 2))
summary(mod02)
##
## Call:
## glm(formula = Zresponse ~ conc + odour, data = gucci)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -36.565 -14.599 -2.835 11.036 56.098
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 18.4157 8.6300 2.134 0.0356 *
## conc0.1 3.4408 6.2746 0.548 0.5848
## conc1 6.3403 6.3140 1.004 0.3180
## conc10 -0.9955 6.7414 -0.148 0.8829
## odour2-Heptanone 6.3594 13.5516 0.469 0.6400
## odour2-Hydroxyhexanoic Acid (DCM) 6.1394 10.0809 0.609 0.5440
## odour2-Nonanal 8.9422 9.6888 0.923 0.3585
## odourBenzoic Acid -9.5201 11.3728 -0.837 0.4048
## odourButyric Acid -7.1927 13.6944 -0.525 0.6007
## odourDodecyl Aldehyde 0.5698 15.0625 0.038 0.9699
## odourEthyl Palmitate -1.9607 24.3939 -0.080 0.9361
## odourGeraniol -11.1482 13.5336 -0.824 0.4123
## odourHeptacosane -0.6370 15.2993 -0.042 0.9669
## odourHeptadecane 13.9834 9.4044 1.487 0.1405
## odourHexadecanal 17.0364 10.6542 1.599 0.1133
## odourMethyl Oleate 19.4371 9.8322 1.977 0.0511 .
## odourOctadecanol -7.2124 24.3939 -0.296 0.7682
## odourOctanoic Acid -1.1539 10.9092 -0.106 0.9160
## odourtrans-Nerolidol -9.5458 13.6894 -0.697 0.4874
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for gaussian family taken to be 521.8397)
##
## Null deviance: 57674 on 108 degrees of freedom
## Residual deviance: 46966 on 90 degrees of freedom
## (160 observations deleted due to missingness)
## AIC: 1010.5
##
## Number of Fisher Scoring iterations: 2
plot(mod02)
## Warning: not plotting observations with leverage one:
## 140, 217
## Warning: not plotting observations with leverage one:
## 140, 217
This model appears to fit the data better than the poisson model.
These data aren’t binomial, but we wanted to fit a binomial model anyway just to see what it would look like.
bindf <- gucci %>%
mutate(BinZresponse= ifelse(Zresponse > 0, 1, 0))
mod03 <- glm(bindf$BinZresponse~conc+odour, data = bindf, binomial)
## Warning: Dropping 160 rows with missing values
## Warning: glm.fit: algorithm did not converge
par(mfrow=c(2, 2))
summary(mod03)
##
## Call:
## glm(formula = bindf$BinZresponse ~ conc + odour, family = binomial,
## data = bindf)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## 2.409e-06 2.409e-06 2.409e-06 2.409e-06 2.409e-06
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.657e+01 1.345e+05 0 1
## conc0.1 1.045e-08 9.782e+04 0 1
## conc1 9.134e-08 9.843e+04 0 1
## conc10 2.327e-07 1.051e+05 0 1
## odour2-Heptanone 2.105e-07 2.113e+05 0 1
## odour2-Hydroxyhexanoic Acid (DCM) 6.163e-07 1.572e+05 0 1
## odour2-Nonanal 2.031e-07 1.510e+05 0 1
## odourBenzoic Acid -3.005e-07 1.773e+05 0 1
## odourButyric Acid 1.590e-07 2.135e+05 0 1
## odourDodecyl Aldehyde -1.590e-07 2.348e+05 0 1
## odourEthyl Palmitate 4.537e-06 3.803e+05 0 1
## odourGeraniol -3.584e-07 2.110e+05 0 1
## odourHeptacosane 2.903e-07 2.385e+05 0 1
## odourHeptadecane 2.075e-07 1.466e+05 0 1
## odourHexadecanal -1.734e-07 1.661e+05 0 1
## odourMethyl Oleate 2.105e-07 1.533e+05 0 1
## odourOctadecanol 4.537e-06 3.803e+05 0 1
## odourOctanoic Acid 2.498e-07 1.701e+05 0 1
## odourtrans-Nerolidol 2.402e-07 2.134e+05 0 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 0.0000e+00 on 108 degrees of freedom
## Residual deviance: 6.3237e-10 on 90 degrees of freedom
## (160 observations deleted due to missingness)
## AIC: 38
##
## Number of Fisher Scoring iterations: 25
plot(mod03)
## Warning: not plotting observations with leverage one:
## 140, 217
## Warning: not plotting observations with leverage one:
## 140, 217
This model appears to fit the data terribly
library(lme4)
## Loading required package: Matrix
##
## Attaching package: 'Matrix'
## The following object is masked from 'package:tidyr':
##
## expand
library(modelr)
Next, run the model and take random trial effects into account. Since a gaussian glm fit our data the best out of all the glms we ran, we will assume a gaussian family for glmm as well.
Run several models including different combinations of variables to see which fits the data best.
glmm1 <- glmer(Zresponse ~ -1 + odour + (1|trial), data = gucci)
## Warning in glmer(Zresponse ~ -1 + odour + (1 | trial), data = gucci):
## calling glmer() with family=gaussian (identity link) as a shortcut to
## lmer() is deprecated; please call lmer() directly
## Warning: Dropping 160 rows with missing values
## Warning: Dropping 160 rows with missing values
## Warning: Dropping 160 rows with missing values
glmm2 <- glmer(Zresponse ~ -1 + odour + conc + (1|trial), data = gucci)
## Warning in glmer(Zresponse ~ -1 + odour + conc + (1 | trial), data =
## gucci): calling glmer() with family=gaussian (identity link) as a shortcut
## to lmer() is deprecated; please call lmer() directly
## Warning in glmer(Zresponse ~ -1 + odour + conc + (1 | trial), data =
## gucci): Dropping 160 rows with missing values
## Warning in glmer(Zresponse ~ -1 + odour + conc + (1 | trial), data =
## gucci): Dropping 160 rows with missing values
## Warning in glmer(Zresponse ~ -1 + odour + conc + (1 | trial), data =
## gucci): Dropping 160 rows with missing values
glmm3 <- glmer(Zresponse ~ -1 + odour + conc + gb + (1|trial), data = gucci)
## Warning in glmer(Zresponse ~ -1 + odour + conc + gb + (1 | trial), data =
## gucci): calling glmer() with family=gaussian (identity link) as a shortcut
## to lmer() is deprecated; please call lmer() directly
## Warning in glmer(Zresponse ~ -1 + odour + conc + gb + (1 | trial), data =
## gucci): Dropping 160 rows with missing values
## Warning in glmer(Zresponse ~ -1 + odour + conc + gb + (1 | trial), data =
## gucci): Dropping 160 rows with missing values
## Warning in glmer(Zresponse ~ -1 + odour + conc + gb + (1 | trial), data =
## gucci): Dropping 160 rows with missing values
glmm4 <- glmer(Zresponse ~ -1 + odour * conc + (1|trial), data = gucci)
## Warning in glmer(Zresponse ~ -1 + odour * conc + (1 | trial), data =
## gucci): calling glmer() with family=gaussian (identity link) as a shortcut
## to lmer() is deprecated; please call lmer() directly
## Warning in glmer(Zresponse ~ -1 + odour * conc + (1 | trial), data =
## gucci): Dropping 160 rows with missing values
## Warning in glmer(Zresponse ~ -1 + odour * conc + (1 | trial), data =
## gucci): Dropping 160 rows with missing values
## Warning in glmer(Zresponse ~ -1 + odour * conc + (1 | trial), data =
## gucci): Dropping 160 rows with missing values
## fixed-effect model matrix is rank deficient so dropping 13 columns / coefficients
glmm5 <- glmer(Zresponse ~ -1 + conc + (1|trial), data = gucci)
## Warning in glmer(Zresponse ~ -1 + conc + (1 | trial), data = gucci):
## calling glmer() with family=gaussian (identity link) as a shortcut to
## lmer() is deprecated; please call lmer() directly
## Warning in glmer(Zresponse ~ -1 + conc + (1 | trial), data = gucci):
## Dropping 160 rows with missing values
## Warning in glmer(Zresponse ~ -1 + conc + (1 | trial), data = gucci):
## Dropping 160 rows with missing values
## Warning in glmer(Zresponse ~ -1 + conc + (1 | trial), data = gucci):
## Dropping 160 rows with missing values
plot(glmm1)
plot(glmm2)
plot(glmm3)
plot(glmm4)
plot(glmm5)
grid2 <- gucci %>%
data_grid(data= gucci, odour,conc) %>%
add_predictions(glmm2)
ggplot(gucci, aes(x = conc)) +
geom_jitter(aes(y = Zresponse)) +
geom_boxplot(data = grid2, aes(y = pred), colour = "purple", size = 0.2)+
facet_wrap (~odour)
## Warning: Removed 160 rows containing missing values (geom_point).
grid3 <- gucci %>%
data_grid(data= gucci, odour,conc) %>%
add_predictions(glmm3)
ggplot(gucci, aes(x = conc)) +
geom_jitter(aes(y = Zresponse)) +
geom_boxplot(data = grid3, aes(y = pred), colour = "red", size = 0.2)+
facet_wrap (~odour)
## Warning: Removed 160 rows containing missing values (geom_point).
grid4 <- gucci %>%
data_grid(data= gucci, odour,conc) %>%
add_predictions(glmm4)
ggplot(gucci, aes(x = conc)) +
geom_jitter(aes(y = Zresponse)) +
geom_boxplot(data = grid4, aes(y = pred), colour = "orange", size = 0.2)+
facet_wrap (~odour)
## Warning: Removed 160 rows containing missing values (geom_point).
grid5 <- gucci %>%
data_grid(data= gucci, odour,conc) %>%
add_predictions(glmm5)
ggplot(gucci, aes(x = conc)) +
geom_jitter(aes(y = Zresponse)) +
geom_boxplot(data = grid5, aes(y = pred), colour = "blue", size = 0.2)+
facet_wrap (~odour)
## Warning: Removed 160 rows containing missing values (geom_point).
Test, using an ANOVA, which model best fits the data
anova(glmm1, glmm2, glmm3, glmm4, glmm5)
## refitting model(s) with ML (instead of REML)
## Data: gucci
## Models:
## glmm5: Zresponse ~ -1 + conc + (1 | trial)
## glmm1: Zresponse ~ -1 + odour + (1 | trial)
## glmm2: Zresponse ~ -1 + odour + conc + (1 | trial)
## glmm3: Zresponse ~ -1 + odour + conc + gb + (1 | trial)
## glmm4: Zresponse ~ -1 + odour * conc + (1 | trial)
## Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## glmm5 6 967.43 983.58 -477.72 955.43
## glmm1 18 970.31 1018.75 -467.15 934.31 21.1232 12 0.04860 *
## glmm2 21 972.64 1029.16 -465.32 930.64 3.6719 3 0.29914
## glmm3 22 974.52 1033.73 -465.26 930.52 0.1148 1 0.73470
## glmm4 53 993.35 1135.99 -443.67 887.35 43.1760 31 0.07176 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
gaga <- gucci[gucci$trial %in% c(4,5,6,11,12), ]
lasttry<- filter(gaga,
!(odour %in% c("1-Hexadecanol", "Methyl Palmitate", "Hexane", "Palmitic Acid", "Ethyl Palmitate" )))
ggplot(lasttry, aes(trial, log(absamp)))+
geom_jitter(aes(colour=odour))
guccigagaGLMM <- glmer(Zresponse ~ -1 + odour + (1|trial), data = lasttry)
## Warning in glmer(Zresponse ~ -1 + odour + (1 | trial), data = lasttry):
## calling glmer() with family=gaussian (identity link) as a shortcut to
## lmer() is deprecated; please call lmer() directly
## Warning: Dropping 91 rows with missing values
## Warning: Dropping 91 rows with missing values
## Warning: Dropping 91 rows with missing values
plot(guccigagaGLMM)
gridded <- lasttry %>%
data_grid(data= lasttry, odour,conc) %>%
add_predictions(guccigagaGLMM)
ggplot(lasttry, aes(x = conc)) +
geom_jitter(aes(y = Zresponse)) +
geom_boxplot(data = gridded, aes(y = pred), colour = "deeppink", size = 0.2)+
facet_wrap (~odour)
## Warning: Removed 91 rows containing missing values (geom_point).
summary(guccigagaGLMM)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Zresponse ~ -1 + odour + (1 | trial)
## Data: lasttry
##
## REML criterion at convergence: 371.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.52803 -0.63084 0.03166 0.47816 2.27990
##
## Random effects:
## Groups Name Variance Std.Dev.
## trial (Intercept) 49.3 7.021
## Residual 162.6 12.753
## Number of obs: 59, groups: trial, 5
##
## Fixed effects:
## Estimate Std. Error t value
## odour2-Heptanol 16.401 9.962 1.646
## odour2-Heptanone 27.169 7.691 3.532
## odour2-Hydroxyhexanoic Acid (DCM) 8.654 8.295 1.043
## odour2-Nonanal 14.559 6.063 2.401
## odourBenzoic Acid 11.908 6.611 1.801
## odourButyric Acid 17.224 8.360 2.060
## odourDodecyl Aldehyde 22.111 8.527 2.593
## odourGeraniol 10.054 8.527 1.179
## odourHeptacosane 21.445 8.360 2.565
## odourHeptadecane 20.477 6.063 3.377
## odourHexadecanal 8.674 7.719 1.124
## odourMethyl Oleate 14.911 7.062 2.111
## odourOctadecanol 10.405 13.459 0.773
## odourOctanoic Acid 6.821 7.719 0.884
## odourtrans-Nerolidol 14.157 7.845 1.805
##
## Correlation matrix not shown by default, as p = 15 > 12.
## Use print(x, correlation=TRUE) or
## vcov(x) if you need it